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Abstract —We describe a convex programming framework for 
pose estimation in 2D/3D point-set registration with unknown 
point correspondences. We give two mixed-integer nonlinear 
program (MINLP) formulations of the 2D/3D registration prob¬ 
lem when there are multiple 2D images, and propose convex 
relaxations for both of the MINLPs to semidefinite programs 
(SDP) that can be solved efficiently by interior point methods. 
Our approach to the 2D/3D registration problem is non-iterative 
in nature as we jointly solve for pose and correspondence. 
Furthermore, these convex programs can readily incorporate 
feature descriptors of points to enhance registration results. We 
prove that the convex programs exactly recover the solution to 
the MINLPs under certain noiseless condition. We apply these 
formulations to the registration of 3D models of coronary vessels 
to their 2D projections obtained from multiple intra-operative 
fluoroscopic images. For this application, we experimentally 
corroborate the exact recovery property in the absence of noise 
and further demonstrate robustness of the convex programs in 
the presence of noise. 

Index Terms —Rigid registration, 2D/3D registration, iterative- 
closest-point, convex relaxation, semidefinite programming. 


I. Introduction 

R IGID registration of two point sets is a classical problem 
in computer vision and medical imaging of finding a 
transformation that aligns these two sets. Typically, a point-set 
registration problem consists of two intertwined subproblems, 
pose estimation and point correspondence , where solving one 
is often the pre-condition for solving the other. A canonical 
formulation of the rigid point-set registration problem for two 
point clouds is the following. Let 1,7 G M dxrn be two point 
sets in dimension d , we want to solve: 

min \\RX-YPfp, (1) 

Pen™* m ,Resold) 

where II™ xm is the set of m x m permutation matrices, and 
R G SO (d) is the rotation matrix from the special orthogonal 
group in d dimension. Finding a solution to this problem is 
difficult as it is a nonconvex, mixed integer nonlinear problem. 
Another close relative of this problem, namely the 2D/3D point- 
set registration, assumes a 3D point-set and 2D projections of 
the 3D point-set. The objective is to find out the pose of the 
3D model that gives rise to the 2D degenerate point-set upon 
projections. This adds an additional complication to that of 
the regular point-set registration problem, namely, the loss of 
depth information. In this paper, we focus on solving forms of 
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2D/3D point-set registration problem described in Section III 


with guarantees using our semidefinite programs conregl 
and conreg2. We further demonstrate their usefulness in the 
application of coronary vessel imaging. 

From a broader perspective, 2D/3D point-set registration 
problem arises in numerous medical imaging applications in 
fields such as neurology [ 11, orthopaedics [2|, and cardiol- 
ogy The associated body of literature on 2D/3D registration 
is expanding rapidly, as is apparent in the thorough review of 
techniques recently published by Markelj et al. 0. Amongst 
the vast literature on the 2D/3D registration problem, we briefly 
discuss the techniques most closely related to the one presented 
in this paper, namely, methods for registration of point-sets. 

As mentioned earlier, a typical point-set registration prob¬ 
lem consists of two mutually interlocked subproblems, pose 
estimation and point correspondence. The key idea in seminal 
Iterative Closest Point (ICP) |5j algorithm is to alternatively 
solve the two subproblems starting from an initial estimate of 
pose. For the correspondence subproblem, it uses closeness in 
terms of Euclidean distance between two points to determine 
whether they correspond to each other. There are many 
variants proposed to enhance the original I CP, either to 
make fast correspondence © or to include more features to 
obtain high quality correspondence Q. Variants for 2D/3D 
registration include ©-©• Though popular, these methods 
suffer for common drawbacks, namely, they are all local 
methods and do not guarantee global convergence. Their 
performances all rely on a good initialization and the spatial 
configuration (distribution) of 3D points. In 2D/3D point-set 
registration, for this type of local optimizers several strategies 
have been proposed to increase the capture range by the use 
of multi-resolution pyramids fl2| , use of stochastic global 
optimization methods such as differential evolution 0 or 
direct search wherein the parameter space is systematically 
and deterministically searched by dividing it into smaller and 
smaller hyperrectangles 0 - In these cases, except in direct 
search, the guarantees on finding the correct global minima 
are very weak. In the case of direct search one requires the 
parameter space to be finite. 

Another line of work G3-G3 in point-set registration 
focus on using soft or inexact correspondences to enhance 
the search for global optimizers. In a recent work [ 16], the 
Oriented Gaussian Mixture Model (OGMM) method is proposed 
to extend the successful Gaussian mixture based nonrigid 
registration algorithm (gmmreg) ED to the 2D/3D case. These 
methods model the point configuration as Gaussian mixtures, 
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and they intend to find a transformation that maximizes the 
overlap between these distributions. The structure of Gaussian 
distribution encapsulates the idea of soft correspondence and 
enables fast implementation. Empirically, as more fuzziness 
is allowed for point correspondences (larger variance for the 
Gaussians), the target function of optimization is smoother and 
hence it is less likely to find a local optimum. Nevertheless, a 
good initialization is still crucial for these types of algorithms 
due to the nonconvex nature of the cost. 

Other than local optimization approaches to the registration 
problem, efforts have been put forth to exhaustively search 
for the global optimum using methods such as branch and 
bound ED- While such techniques can be fast in practice, in 
principle registration problems that require exponential running 
time can exist. In this paper, we introduce a different approach 
to jointly and globally solve the pose and correspondence 
problem in 2D/3D point-set registration by relaxing the hard 
registration problem to convex programs. Unlike local methods, 
the solutions of convex programs do not depend on the 
initialization. We prove for certain noiseless situations the 
global optimizer of the surrogate convex problem coincides 
with the global optimizer of the original problem. Therefore 
through convex programming we are guaranteed to attain the 
global optimum in polynomial time. In the noisy regime where 
the solution of the convex programs closely resembles the 
optimizer of the original problem, the proposed method will 
converge to the neighborhood of the original optimizer. In 
these situations, it is practically advantageous to use convex 
optimization instead of greedy methods or global optimization 
that may run in exponential time. From a theoretical point 
of view, the convex relaxation framework can be used to 
characterize the polynomial time solvable cases in 2D/3D 
registration with unknown correspondences, through analyzing 
the exact recovery condition mathematically. 

Before proceeding to the rest of the sections, we note 
that the correspondence estimation sub-problem, which is 
typically treated as the quadratic assignment problem, is NP- 
hard on its own. Therefore the solution to our proposed convex 
programs may not always be close to the solution of the 
joint pose and correspondence estimation problem. There are 
certainly cases for which the use greedy approaches is the 
only resolution, though not globally optimal. However, we 
hope the promising results demonstrated by the proposed novel 
non-iterative approach towards 2D/3D registration will attract 
proposals with tighter convex relaxations. The shortcomings 
of our approach are discussed in the concluding remarks. 

A. Our contributions 

The main contributions of this paper can be summarized as 
the following: 

Algorithm: The original 2D/3D point-set registration of a 
3D model and multiple 2D images is formulated as nonconvex 
MINLPs that are difficult to solve. We propose two convex 
relaxations for these MINLPs. The programs jointly optimize 
the correspondences and transformation, and the convex 
nature of these programs enable efficient search of global 
optimum regardless of initialization using standard off-the- 
shelf conic programming software. Furthermore, one of the 


convex programs conreg2 gives solution to a variant of 
2D/3D registration problem where we simultaneously estimate 
the point correspondences between multiple 2D images while 
respecting the epipolar constraint. This could be utilized in 
finding the corresponding image points for 3D reconstruction 
purposes |I8) . Another natural extension of these programs is 
the incorporation of local descriptors of points as additional 
terms in the objective of both the programs. For our clinical 
application, we use tangency of the vessel in the local neigh¬ 
borhood of the point to illustrate the use of point descriptors. 

Exact recovery analysis: We prove exactness of the convex 
programs, that is, under certain conditions the proposed relaxed 
convex programs will give a solution to the original MINLPs. 
We prove that under noiseless situation the relaxed convex 
programs are in fact able to exactly recover the rotation 
and permutation matrices that match the projected points to 
the 3D model. Our simulations show that algorithms’ global 
convergence results also hold with noise when the error in 
recovery grows nearly proportionally to the added noise. Real- 
data examples corroborate the theoretical results, and suggest 
potential applications in coronary tree matching. 

Here we outline the organization of this paper. In SectionJII] 
we summarize the notation used in the paper. In Section [in] 
we describe the type of 2D/3D registration problems we intend 
to solve and present two MINLPs formulations when there 
are multiple images. In Section IV we present the convexly 
relaxed versions of the 2D/3D registration problem in terms of 
tractable semidefinite programs. In Section [V] we prove that 
achieving global optimality is possible under certain situations. 
In Section |VI-A[ we mention how to incorporate additional 
features to the convex programs to enhance registration results, 
in particular in the application of coronary vessel imaging. 
Lastly, in Section VII we empirically verify the exact recovery 
property, and demonstrate the robustness of the algorithm on 
simulated and real medical datasets for the registration of 2D 
coronary angiogram with the 3D model of the coronaries. 


II. Notation 

We use upper case letters such as A to denote matrices, and 
lower case letters such as t for vectors. We use Id to denote 
the identity matrix of size d x d. We denote the diagonal of a 
matrix A by diag(A). We use A n for integer n > 1 to denote 
the multiplication of A with itself n times. We will frequently 
use block matrices built from smaller matrices, in particular 
when we deal with problem (REG2). For some block matrix 
A, we will use A{j to denote its (i,j)- th block, and A(p,q) 
to denote its (p, q)- th entry. We also use A t to denote the i- th 
column of A. When we have an index set s = {si,..., s n } 
where each Si is an integer, we use A s to denote the matrix 
[A Sl ,... ,A Sn \. We use A >z 0 to mean that A is positive 
semidefinite. We use IMh to denote the Euclidean norm of 
x £ M n . We denote the trace of a square matrix A by Tr(A). 
We use the following matrix norms. The Frobenius, mixed 
and entry-wise i\ norms are defined as: 

\\A\\ F = Tr(A T A) 1 / 2 , \\A\\ 2j1 =]r|MMl2, and 
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Piii = EEwu)i- ( 2 ) 

i 3 

The Kronecker product between matrices A and B is denoted 
by A® B. The all-ones vector is denoted by 1. We use \s\ 
to denote the number of elements in a set 5 . For a set s we 
use conv(s) to denote the convex hull of s. We introduce the 
following sets, 

II“ xb = {A e {0, l} axb :^2A(i,j) = 1 ,^A(i,j) = 1}, 

1=1 j=1 

n r b { 0 , l} axb :J2A{i,j) = l,^A(i,j) < 1}, 

i =1 3 =1 

a b 

n axb = {j 4 g {0, l} axb < 1}, 

i=l j=1 

and we frequently call them the permutation, left permutation 
and sub-permutation matrices. 

III. Problem statement 

A 3D centerline representation of a coronary artery tree, 
segmented from a preoperative CTA volume, is to be registered 
to N fluoroscopic images. Our observed 3D model with m 
points is described by matrix X G M 3xm . The projection 
operators between the 3D space and the i-th image, represented 
by 4/*, are known from the calibration of the X-ray apparatus. 
The projection operator maps the 3D model to a degenerate 
point cloud Xi G M 3xni that represents i-th projection image. 
By degenerate we mean the affine rank of 2 \ is two. We will 
assume that rii > m, 1 < i < N. The 2D/3D registration 
problem is to find an alignment matrix R , in the special 
orthogonal group SO(3), that matches some permutation of the 
observed 3D model with each of the degenerate projections. 

We propose to solve the multiple images 2D/3D registration 
problem as: 

N 

(REG1) min V \\*iRX - XP || 2jl , 

RES 0 ( 3 ), 

x i= 1 

PiGn?* 

where || • || 2,1 is the mixed ^ 2/^1 norm. We do not know a 
priori the correspondences between points in the 3D model 
and projection image (which are encoded in Pi for each of N 
given images), and the rotation R. These are found by solving 
(REG1). Intuitively, the minimization of the cost in (REG1) 
simply ensures by subsampling and permuting X z , the image 
points should as close as possible to the projections of the 3D 
model X posed by some rotation R. We employ || • || 2,1 norm 
in order to alleviate the costs due to the outliers. If we consider 
a maximum likelihood estimation framework with Gaussian 
type noise, a squared Frobenius norm || • |||, could be replaced 
instead. 

Next we consider a variant of problem (REG1) where the 
correspondences of the points between the two images are 
available. Such correspondences could come from feature 
matching, or by exploiting epipolar constraints. This basically 
means P z need to be optimized dependently to preserve 
correspondences between the images. Let the correspondence 


of points between X a and 2 ^ be denoted by P a &, where 
Pab{i,j) = 1 if point i in X a corresponds to point j in X 5 and 
zero otherwise. We do not require one-to-one correspondence 
between points such as each row or column of P a b can 
have more than one nonzero entries. In the presence of such 
correspondence information, R can be obtained as the solution 
to the following optimization problem: 

N N N 

(REG2) min E W*iRX-liPih, !+EE \\Pab-Pab\\l 

( Pi,p ab )es i=1 a=1 b>a 

where P a b is the permutation matrix that relates the points 
of X a and 2^ for N(N — l )/2 image pairs, and || • ||i is the 
entry-wise t\ norm. The domain of optimization for the S will 
be explained in the next subsection. We note that solving this 
problem could also be useful if we are interested in estimating 
the correspondence between the N(N — l)/2 image pairs 
directly for reconstruction purpose. 

For the remainder of this paper we only consider two images 
to simplify notation. It should be obvious the solution we 
propose in the subsequent sections can be easily extended 
to the multiple images case. Problems (REG1) and (REG2) 
have nonconvex domains which consist of integer and rotation 
matrices, as thus are very difficult to solve. In subsequent 
sections, we formally define these domains and present convex 
relaxation that can recover the exact solution under certain 
conditions. 

A. Domains for permutation matrices 

A more formal way to understand the problems (REG1), 
(REG2) is the following: Suppose there is a ground truth 3D 
model with m! points which is described by the coordinate 
matrix X gt G M 3xm . Our observed 3D model is described by 
matrix X G M 3xm . Assume m < ni, ri 2 < m!. In this case, 
we have 

Xi = *i RX gt Q u X 2 = RX gt Q 2 , (3) 

X = X gt Q s , (4) 

where R , ^ 1 ,^ 2 , 21,22 are matrices as introduced as before, 
Qi G n f xni ,Q 2 e Uf xn2 and Q s G n^ xm . One can 
intuitively regard Qi,Q 2 ,Q 3 as operators that generate the 
images and the observed model by sub-sampling and permuting 
the points in ground truth 3D model X gt . For example, if the 
zth row of Q 1 is zero, then it means the i -th point of X gt (or 
more precisely, i-th column of X gt ) is not selected to be in 

2 i. 

Here we make an assumption, that if a point i in X gt (the i -th 
column of X gt ) is contained in X , then the projections of point 
i must correspond to some columns of 2i, 2 2 . Loosely speaking, 
it means the observed 3D model X is a subset of the point 
clouds 2 i, 2 2 . In this case we know Pi = Q 1 Q 3 G n^ lXm 
and P 2 = Q 2 Q 3 £ n™ 2Xm . Furthermore, if a point i of 
Xg t is not selected by Q\ (meaning JE Qi(i,j) = 0), then 
Q 1 Q 1 G M nxn is almost an identity matrix J n , except it is 
zero for the ith diagonal entry (similarly for Q 2 Q 2 )• We use 
these facts after multiplying the equations in ^ from the right 
by QiQz,QlQz to get 

V 1 RX = I 1 P 1 , 


'&2RX = X^P'i- 


(5) 
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When there is no noise in the image, the equations in (3) have 
to be satisfied. If not we turn to solve the optimization problem 
(REG1). 

To simultaneously estimate the correspondence Pi 2 , we use a 
similar construction as [19] and [20]. Let Pi 2 = QjQ 2 - When 
having exact correspondence matrix Pi 2 without ambiguity, 
we require 

Pi2 = Pi 2 . (6) 

We relate Pi, P 2 , P 12 through a new variable 

P = [Ql Q 2 QsfiQl Q 2 Q 3 ] e R (n 1 + n 2 +m)x(n 1 + n 2 +m)_ 

(7) 

In this context, the variables Pi,P 2 used in (REG1), along 
with Pi 2 are simply the off diagonal blocks of the variable P. 
Then the domain of (Pi,P 2 ,Pi 2 ) is simply 

5 = {(Pi,P 2 ,Pi 2 ): 

P P 0, rank(P) < n, Pa = I n ., 

Pi g n™ iXm ,p 2 g n^ 2Xm ,Pi 2 g n niXn2 }. (8) 

P vl should be understood as the diagonal blocks of P. 
Again, we solve the optimization (REG2) when images or 
correspondences are noisy. This formulation ensures that the 
left permutation and sub-permutation matrices Pi,P 2 ,Pi 2 are 
cycle-consistent. This basically means the relative permutation 
matrices Pi, P 2 , P 12 all can be constructed by some underlying 
global “sampling” matrices Qi, Q2, Q3 for each image and the 
3D model (see (20) for a detail description). 

IV. Convex relaxation 

In this section, we propose two programs, which are the 
convex relaxations of problem (REG1) and (REG2). We replace 
the non-convex constraints that prevents these problems from 
being convex to looser convex constraints. These relaxations 
enables the efficient search of global optimum using standard 
convex programming packages. To solve the convex problems 
we use CVX, a package based on interior point methods (2l) 
for specifying and solving convex programs (22) , (23) . Before 
looking at each of the two problems, we introduce a result 
from (24) to deal with the nonconvexity of §0(3) manifold. 
In (24) the authors give a characterization of the convex hull 
of SO (d) for any dimension d , denoted as conv(§©(cf)), in 
terms of positive semidefinite matrix. The d = 3 version will 
be restated in the following theorem. Due to space limitation, 
for this theorem we locally define the (i,j)-th entry of a matrix 
X as Xij instead of X(i,j). 

Theorem 1 (Thm 1.3 of |24|). conv(SO(3)) = {X e R 3x3 : 

1- x 11~ x 22+ x 33 X 13+ X 31 X 12 ~ X 21 X 23+ X 32 

: * 13+*31 l~h x 11 — x 22 — x 33 x 23~ x 32 x 12+ x 21 

X 12 -X 21 x 23~ x 32 l + x ll+ x 22+ x 33 x 31~ x 13 

X 23+ X 32 X 12+ X 21 X 31 ~ X 13 1 ~ X 11 + x 22 ~ x 33 

is positive semidefinite }. 

Such linear matrix inequality is convex and can be included 
in a semidefinite program. For both problems (REG1) and 
(REG2), we relax the domain of P, which is §0(3) into 
conv(§©(3)). However, this relaxation alone is not sufficient 
to turn (REG1) and (REG2) into convex problems, and we 


lay out the details of the fully relaxed problem in the next 
two subsections. We present the relaxations in the case of two 
images, and the generalization to more than two images should 
be obvious to the reader. 


A. Relaxing (REG1) 

We remind readers that in problem (REG1) we find a rotation 
to register the 3D model with the images. The points in the 3D 
model X and Pi, P 2 should be matched after a permutation 
operation through the variables Pi, P 2 . Pi, P 2 have their entries 
on the domain {0,1}. It is difficult to optimize on such discrete 
domain. A typical way to bypass such difficulty is by the 
following discrete-to-continuous relaxation, 

Pi G {0,l} n * xm P i g [0, l] n * xm (9) 

for i = 1, 2. The relaxed matrices Pi and P 2 can be interpreted 
as probabilistic map between the points in the image and 
the points in the 3D model. For example, the size of an 
entry Pi(z, j) resembles the probability of point i in Pi being 
corresponded to point j in the 3D model. Such relaxation has 
been used for problems such as graph isomorphism (25) . 

We now introduce the convexly relaxed problem conregl, 

(conregl) 

min \\*iRX - Pi Pi || 2> i + || ^ 2 RX - P 2 P 2 || 2 ,i 
R,Pl,P 2 

s.t R G conv(SO(3)), 

Pi G [0, lp xm , 

Pil < 1, 1 T Pi = 1, for iG {1,2}. 

The last constraint simply follows from the definition of 
left permutation matrix in the notation section. In case when 
there are N > 2 images, there should be N matrix variables 
Pi,..., Pat for z G {1, • • • , TV} with constraints as indicated 
above. 


B. Relaxing (REG2) 

In problem (REG2) we include the estimation of P 12 , 
the correspondence between image Pi,Pi, on top of pose 
estimation. In this case, we relax the §0(3) and the integrality 
constraints just like the case of problem (REG1). However, 
different from (REG1), the domain of optimization for (REG2) 
includes the set S in ([8), which has an extra rank requirement 
that prevents the convexity of (REG2). A popular way of 
dealing with the rank constraint is by simply dropping it. Such 
relaxation is common in many works, for example in the 
seminal paper of low rank matrix completion (26) . We now 
state the convex problem conreg2 
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(conreg2) 

min \\ViRX - hPi || 2 ,i + ||^ 2 RX - Z 2 P 2 || 2 ,i 
R,Pl,P 2 ,Pl2 

+ IIP12-P12II1 
s.t R E conv(SO(3)), 

Pi € [0,lp xm , for i E {1,2}, 

P 12 € [0, l] niX " 2 , 

Pi 1 < 1, 1 T Pi = 1, for * E {1,2}, 

P12I < 1 , l T Pi2 < 1 , 


4 , 7 . -| 

P12 

Py .2 

I'n -2 

Pi 

Pi 


Pi 

Pi 


> 0. 


Again, conreg2 inherits the equality and inequality con¬ 
straints for s and Pi 2 from the the property of left permu¬ 
tation and sub-permutation matrices. In case when there are 
N > 2 images, there should be N matrix variables Pi,..., Pat 
and N(N — l)/2 variables P^ . The relevant constraint that 
encodes cycle consistency in such case should be 


Ini 

Pl2 

• * * Pin 

Pi' 

PI 

Iri2 

• * * P 2 N 

P 2 

p In 

P 2N 

In N 

Pn 

PI 

PI 

■ p t n 

Im _ 


C. Connection to classical registration problem in 3D 

It is obvious that similar relaxation can be derived for the 
classical rigid registration problem of Q- For the sake of 
completeness, we state the convex formulation to the 3D 
registration problem in Q- 

min \\RX-YP\\l 
s.t R £ conv(SO(3)), 

PE [0,l] mXm ,Pl = 1,1 T P= 1. (11) 

Again, in applications where robustness is a requirement, a 
mixed ^ 2/^1 can always be replaced instead. The discussion 
of this program is out of the scope of this paper and will be 
included in a future publication of the authors. 


simply by finding the closest orthogonal matrix to R in terms 
of Frobenius norm, namely 

R * = argmin||P — R\\ 2 F (12) 

-R£©(3) 

where 0(3) is the orthogonal group in 3D. For completeness 
we provide the analytical expression for R* . Let the singular 
value decomposition (SVD) of R be of the form 

R = UY>V t . (13) 


Then 


R* = UV T . 


(14) 


The reader maybe curious that the special orthogonality 
constraint is not enforced in the rounding procedure. This 
is indeed unnecessary since an element R in conv(§©(3)) has 
positive determinant [24], hence UV T obtained from the above 
procedure necessarily is in §0(3). 

As we see in the case when the solutions of conregl, 
conreg2 are not feasible for (REG1) and (REG2), a rounding 
procedure is required to project R * to a nearest point in 
§0(3). Although it is possible to prove ||P* — P 0 | being 
small (as in other applications of convex relaxation, see [ 27 1 
for example), R* in general is still suboptimal in terms of the 
cost of the original nonconvex problem (REG1) and (REG2), as 
the rounding procedure does not consider optimizing any cost. 
It is thus quite typical to use the approximate solution from 
convex relaxation as an initialization and search locally for a 
more optimal solution, for example in (28j. In this paper, we 
use the OGMM algorithm, which searches locally by optimizing 
a nonconvex cost using gradient-descent based method. 


V. Exact recovery 

In this section, we will give proof that the two relaxed convex 
programs exactly recover the solution to the original problems 
(REG1) and (REG2) under certain situations. We will show 
that the solutions returned by our algorithm conregl and 
conreg2 lie in the domain of (REG1) and (REG2), hence 
showing the attainment of global optimizers of (REG1) and 
(REG2), as the domain of (REG1) and (REG2) are properly 
contained in the domain of conregl and conreg2. By this 
we mean exact recovery. The proofs of exact recovery for 
conregl and conreg2 are almost the same. We will go 
through proving exact recovery for conregl in detail, and 
just state the results for exact recovery of conreg2. 


D. Projection and local optimization 

Although in section [V] we prove that under certain situations 
the two convex programs give solution to (REG1) and (REG2), 
this will not always be the case as the domain of search has 
been made less restrictive. As we are concerned about posing 
the 3D model in this paper, we describe a strategy in this 
section to obtain an approximate solution from conregl and 
conreg2 for the rotation in general cases. Denote the optimal 
“rotation” in the domain conv(§©(3)) of both programs as 
R. As R is not necessarily in §0(3), we need a strategy to 
project R to a point R* on §0(3). The strategy we use is 


A. Our assumptions 

We summarize the assumptions for our proof. 

(1) We consider the situation when rt\ = n<i = m = m !, 
which means that all the image points X\ , X 2 come from the 
projection of the 3D model. 

(2) We consider the noiseless situation when the equations 

hPi = 1 RX, X 2 P 2 = ^2 RX, (15) 

can be satisfied, where Pi,P 2 E II™ xm and R £ §0(3). We 
will call these variables the ground truth. Notice that under 
assumption (1), P\, P 2 are permutation matrices instead of 
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left-permutation matrices. In the case of conreg2, we further 
assume 

Pi2=Pi2- ( 16 ) 

In order to show exact recovery, we want to show that the 
solution P, A,A to conregl, is in fact P, Pi, A- For 
conreg2 we in addition show A 2 = Pi 2 - 

(3) Without lost of generality we can consider R = I 3 and 
Pi, A = Im- Such consideration is backed by lemma [I] 

(4) The m points are in generic positions. A set S = 

{si, 52 ,..., s n } of n real numbers are called generic if there 
is no polynomial h(x 1 ,..., x n ) with integer coefficients such 
that , s n ) = 0 129]. When we say the points are in 

generic positions , we mean the set of 3 m real numbers from X 
is generic. This assumption excludes the possibilities that three 
points can lie on a line or four points can lie on a plane. It 
also implies the centroid of any subset of the points is nonzero. 
Furthermore, no two subsets can share the same centroid. 


are in one-to-one correspondence with solutions to conregl 
with 3D model RX and image ZiAj^hA- Therefore without 
loss of generality, using the change of variable introduced in 
(17), we can let the 3D model being X' = RX , the images 
being X[ = X 1 P 1 = RX = and X' 2 = 4/ 2 2f'. We 

thus consider solving the problem conregl in the following 
form 

min W^iRX' - ^iX'PAUi + II ^ 2 RX' - 

R,P 1 ,P 2 >11 11 

s.t R G conv(SO(3)), 

Pi e [o,ip Xm , 

Pil < 1,1 T Pi = 1,for i E {1,2}. 

( 21 ) 

In this case, instead of showing R = P, Pi = A, A = A, 
it suffices to show R = I 3 and Pi, A = Im being the unique 
solution to conregl. 


(5) There are at least four points, i.e. m > 4. 

(6) Without lost of generality, we consider 4>i[0 0 1] T = 

Lemma 2. 




4^2 [0 0 1] T = [0 0 1] T (the directions of projection are 
perpendicular to z-axis). 


^xRX’ 

•ii 2 RX' 

= HfiX’p! 

= $ 2 X'P 2 

(22) 

(23) 

We note that although we only give proof of exact recovery 


under these assumptions, in practice it is possible for our 
algorithm to recover the pose exactly when 711,712 > m. 

and 

V 1 R n X' 

= T' 1 X'P 1 n 

(24) 

B. Exact recovery property of conregl 


'H 2 R n X' 

= T' 2 X , P 2 ” 

(25) 

Let the outcome of the optimization conregl be P, Pi, A- 

for any n > 1. 




Theorem 2. Under our assumptions, the solution P, A, A to 

Proof See appendix |A| 


□ 


conregl is unique, and R = P, Pi = Pi, P 2 = P 2 . 

When we have exact recovery, P* obtained from the 
projection of P to §0(3) is simply P, as P is already in 
§0(3). To prove the theorem, we first state a couple lemmas. 

Lemma 1. If we make the following change of data 

X^X\ X 2 ^p', (17) 

where 

X' = RX, X[ =XiA, x 2 =X 2 A, (18) 

then the solutions of conregl are changed according to the 
invertible maps 


Lemma 3. Pi, P 2 are doubly stochastic matrices and it suffices 
to consider ||A — / m ||i < 1 and ||A — A||i < 1- Then 


lim Pf = lim P 2 n = A, 

n—>• 00 n—>• 00 


(26) 


where A is partitioned by index sets ai,..., 

= (l/|«i|)H T , for * 7 ^ j. (27) 

Proof. See appendix |B] □ 

We are now ready to prove theorem [2] 


Proof. We first want to show R = I 3 . Taking limits of (|24Jl, 
and using lemma [5] we get 


R -> 


n -> p/ p 1? p 2 


RlR> 


(19) 


Proof. 


I iiPi. - ^iP2f|| 2)1 + ||j 2 p 2 - $ 2 pa:|| 2)1 
liiPiP^Pi - T'ii?i? r i?x|| 2>1 + 
\i2P2P2P2 - ^ 2 rr t rx || 2> i 
|X}P 1 t P 1 -^iPP t X , || 2 , 1 + 
|X^P 2 t P 2 -^2PP t X'|| 2i1 . 


T-iPX' = T'lX'A, T' 2 PX' = ^ 2 X'A, (28) 

where B = lim^oo R" and A = lim^oo Pf = lim^oc Pf 
. We saw in lemma [3] A has multiple irreducible components, 
each being an averaging operator for the points of relevant 
indices. The equations lead to 


( 20 ) 


BX' = X'A. 


(29) 


The last second equality is due to the fact that P{Pf = 
Im,P 2 P 2 = Im,R T R = h, since permutation matrices are 
necessarily orthogonal matrices. Therefore, a solution after the 
change of data in (17) is P^Pi, PfA and PP T . □ 


Since B = lim^^oo P n = P lim n ^oo P n 1 = PP, multiply- 

(30) 


ing (|29j) from the left by P we get 

RX'A = X'A 


Since the map presented in ( [T9] ) is invertible, we know the 
solutions to conregl with 3D model X and image Xi,X 2 


Now we show P = I 3 for each of the following cases. 
Case 1: Suppose A has three or more than three irreducible 
components. Let the centroid of any three components be 
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[ci,c 2 ,c 3 ] G M 3x3 . By equation (30) R[ci,c 2 , c 3 ] = [ci,c 2 ,c 3 ]. 

By the assumption that columns of X' are generic, [ci,c 2 ,c 3 ] 
has full rank hence invertible. Thus R = I 3 . 

Case 2: Suppose A has two irreducible components. This 
means R[ci, c 2 ] = [ci, c 2 ] where ci and c 2 are independent by 
the assumption that points are generic. Let c\ and c 2 denote 
the normalized version of ci, c 2 . i? = 4/ Cl c 2 + A c 2 , where 
T r ClC2 denotes the projection operator to the plane containing 
ci,c 2 , A denotes the vector cross product and a some scalar 
constant with \a\ < 1. We know \a\ <1 since the largest 
singular value of R is less than 1, as 


conv(SO(3)) C conv(0(3)) = {O e R 3 x 3 |O t O r< h} 

(31) 

where conv(0(3)) is the convex hull of the orthogonal group in 
3D. Such characterization of the convex hull of the orthogonal 
group can be found in [30]. If \a\ = 1, then we indeed have 
proven R = / 3 . If \a\ < 1, we will arrive at a contradiction. In 
this situation B = lim n ^ oc R n = 4/ ClC2 - By the assumption 
that m > 4 and equation ( [29] ), there exists four points with 
coordinates Y = [Yi, Y 2 , Y 3 , Yf\ G M 3x4 such that either 


*C 1C2 V = [ci,ci,ci,c 2 ], or T^ci c 2 Y = [ci,ci,c 2 ,c 2 ]. (32) 


This means either Y\ , Y 2 , 13 form a line, or Yi, Y 2 , 13 , 14 form 
a plane and both cases violate the assumption that points are 
in generic positions. 

Case 3: Suppose A has a single irreducible component. In 
this case BX' = [ci,..., ci], where c\ is the center of the 
point cloud. It is easy to show that B = 4/ Cl , where 4/ Cl is 
the projection onto the line spanned by c\. However, ^ Cl X' = 
[ci,..., ci] implies columns of X' forms a plane with c\ being 
the normal vector. Again this violates the assumption of generic 
positions. 


Now since R = / 3 , from equations (22) and (23) we have 


[0 0 l]X' = [0 0 l]X'Pi, [0 0 1]X' = [0 0 1 \X'P 2 . (33) 


Again from the generic positions assumption, [0 0 l\X f G M m 
is a vector with distinct entries. It is known that in this situation, 
the two doubly stochastic matrices Pi, P 2 = I m (see corollary 


8 of (53). 


□ 


C. Exact recovery property of conreg2 

Let the solution of conreg2 be P, Pi, P 2 , Pi 2 - 

Theorem 3. Under our assumptions, the solution P, Pi, P 2 , P 2 
to conreg2 is unique, and R = P, P\ = Pi, P 2 = P 2 , Pi 2 = 
P 12 . 

Proof The recovery of P, Pi, P 2 can be proven in the exact 
manner as in previous subsection. If P 12 = Pi2» || Pi 2 — A 2 II > 
0 unless P 12 = Pi2- We therefore conclude our proof. □ 

Although it does not seem like the additional constraints in 
conreg2 helps our proof at this point, in practice in the case 
when ni,n 2 > n 3 , conreg2 demonstrates slight advantage 
over conregl in simulation in terms of stability (Fig. [4]}) 
and also in real data (Fig. [5]). 


VI. Clinical Application and Additional Features 

Though the problem of 2D/3D registration appears in several 
medical imaging applications, this paper is motivated by the 
clinical setting in which two or more fluoroscopic images of the 
coronaries are used for intervention guidance. In a procedure 
called percutaneous coronary intervention (PCI), commonly 
referred to as angioplasty, a cardiac interventionalist introduces 
a thin, flexible tube with a device on the end via arterial 
access (usually femoral) into the patient’s coronary arteries 
which may be blocked by calcification. To guide and position 
the device accurately, the cardiac interventionalist uses X-ray 
fluoroscopy with intermittent contrast injections (angiography). 
The resultant fluoroscopic images contain projected outlines of 
the coronary vessels and blockages if any. However, these intra¬ 
operative fluoroscopic images do not have the performance 
characteristic of pre-operative images such as Computed 
Tomography Angiography (CTA) with blood pool contrast 
injection that enables the occlusions causing calcifications to 
be clearly distinguished. The benefit of using pre-operative 
Coronary CTA images are well documented p2|-p5| and 
are becoming standard of care. Coronary CTA may also 
allow better determination of calcification, lesion length, stump 
morphology, definition of post-CABG anatomy, and presence 
of side branches compared to fluoroscopy or angiography 
alone [33]. 

To aid the cardiac interventionalist performing the procedure 
on the coronaries, a strategy to provide additional information 
by fusing pre-operative images with intra-operative fluoroscopic 
has been developed. However, finding a transformation to 
align pre-operative and intra-operative images is challenging 
since finding inter modal correspondences is a nontrivial 
problem. Typically to achieve this multi-modality fusion, 
vessel centerlines segmented from both modalities are used 
as landmarks for inter modal correspondences. A 3D model 
of the vessels extracted from pre-procedure CTA is aligned 
with the 2D fluoroscopic views, and overlaid on top of the 
live fluoroscopy images, thereby augmenting the interventional 
images. 

In this paper, we process the CTA images using the 
segmentation algorithm presented in (36) and fluoroscopic 
images using [371, (38) to get the respective point-sets. Thus, 
the clinical alignment transformation between pre-operative 
and intra-operative images is reduced to (REG1 ) or (REG2). 

Though this paper was motivated by the clinical scenario 
of CTO, our algorithms are generalized to handle other 
clinical situations where natural structures can be obtained by 
image segmentation methods. For instance, vessels from brain, 
liver, kidney or peripheral images can be used by processing 
them with relevant segmentation algorithms. As illustrated in 
Section ??, the algorithm is not restricted to point-sets from 
vessel like structures only. Other structures, notably surfaces, 
extracted by means of segmentation algorithm can be used 
to obtain desired point sets required by methods (REG1) 
or (REG2 ). Typically, the structure representations extracted 
from segmentation methods need to be re-sampled uniformly 
to result in respective point-sets. 





A. Incorporating additional features 

In the context of finding an alignment transformation between 
two point clouds, point descriptors maybe valuable by favoring 
transformation that matches the descriptors as well as the 
coordinates. Our proposed formulations easily allow the incor¬ 
poration of descriptors by adding additional terms in the cost 
of (REG1), (REG2) that encourage matching of transformation 
invariant features, or features transforming according to rotation. 
To match transformation invariant features, we simply add the 
following terms: 

||X F — ifPilla,! + \\X F -1[P 2 \\ 2A , (34) 

to our cost in (REG1) and (REG2). X F e R dxm , X[ G R dxn \ 
and Zf G M dxn2 denote some d dimensional feature vectors 
for the points in 3D model and the images, respectively. To deal 
with features that transform with rotation, such as intensity 
gradient, the following terms can be added to (REG1) and 
(REG2): 

\\<!’ 1 RX F -X[Pi\\ 2 ,i + \\V 2 RX f - X 2 f P 2 || 2i1 . (35) 


B. Constructing features for coronary vessel 

To incorporate features for the specific application of 
registering coronary vessel, we will leverage the idea of patch 
(39). We note that the typical features for vessels, such as the 
tangent of the vessel, are in general extracted from the local 
neighborhood of the particular point (40|. Therefore, we simply 
combine the coordinates of neighborhood points around each 
point, which we denote as patch, as the descriptor for each 
point. For the coordinate of point i, we concatenate it with the 
coordinate of subsequent points i+l,i+2,..., i+n p —1 to form 
a patch of size n p . We denote the result of such concatenation 
as Xf , X p • and Zf and we store it as the i-th column of the 
matrices X p G R 3 n P xrn and Zi G R 3 n P xn \ X 2 G R 3 n P xn \ 
Now instead of registering point, our goal is then to register 
these patches of the 3D model to the images. The cost in 
(REG1) when considering matching patches become 


II (I np ® *ii?)X p -If Pi || 2 ,1 + II (In p ® * 2 R)X p -1 f P 2 || 2 ,i , 

(36) 

and for (REG2) the cost is 

\\(I np (^^ l R)X p -l[P 1 \\ 2 , 1 + 

||(I np ®V 2 R)X P -2fP 2 || 2il + ||Pi 2 - A 2 II 1 - (37) 

In our experiments, we found n p = 3 suffices to improve 
the solution without introducing too many variables into the 
optimization. A practical issue is that one needs to take care 
of the sampling density of the points on the centerline. If the 
point spacing is very different between X and Zi, Z 2 , the 
patches cannot be well matched. To deal with this issue, we 
first project the randomly posed 3D model using \hi and ^ 2 , 
calculate the average spacing between two subsequent points in 
the projection, and sample the image points Zi, Z 2 according 
to this spacing. 



Fig. 1. An illustration of synthetic data used for simulations. Solid blue lines 
represent the centerlines on 2D and dashed red lines represent the projection 
of 3D model on the image plane. The left figure illustrates the case when the 
3D model is a proper subset. 
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Fig. 2. Simulation results of the two convex programs and OGMM (a) when 
the images are the projection of the complete 3D model, and (b) when the 
images contain the projections of the 3D model as a proper subset. Here we 
only plot the cases when OGMM finds a solution within 25 mm RMSD. If for 
a particular rotation range there is no boxplot it means no solution has less 
than 25 mm RMSD for that range. 


VII. Experiments 

In this section we evaluate the performance of our algorithm 
using synthetic data of the heart vessels, and real CT data of 
six different patients. In particular, we demonstrate through 
simulation that regardless of the initial pose of the 3D model, 
our algorithm exactly recovers the ground truth pose when 
there is no noise. In the presence of image noise, our algorithm 
is able to return a solution close to the ground truth. These 
observations are also verified when working with real data. For 
these experiments we solve conregl and conreg2 using 
the convex programming package CVX in MATLAB using 
interior point method. In general the time complexity of interior- 
point-method for conic programming is 0(N 3,5 ) where N 
denotes the total number of variables ED- The main source of 
complexity in our program comes from the stochastic matrices 
Pi and P 2 . Assuming there are m points, the time complexity 
is then 0(N 3 - 5 ) ^ 0(m 7 ). Since the interior-point-method 
is a second order method requiring the storage of Hessian, 
the space complexity is 0(N 2 ) ^ 0(m 4 ). While this looks 
intimidating, as reported in the upcoming sections it is quite 
fast in practice due to the fast implementation of the solvers 
in CVX. To measure the quality of registration, we use the 
root-mean-square-distance 

RMSD = + || V 2 R*X - VzRXWf). 

2\/ m 

(38) 

As a reminder, R* is the solution after rounding from 
conregl, conreg2, and R is the ground truth rotation. R 
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Fig. 3. (a) A simulation using the Stanford bunny dataset when there is 
no noise. The red points are taken from X \, and the blue points are taken 
from the model X posed by the solution from conregl. The pose is exactly 
recovered when there is no noise, (b) RMSD v.s. noise level for the bunny 
dataset. Each data point is averaged over 20 noise and pose realizations. The 
average spacing between the points is about 0.04. 


is known in the case of synthetic data, and for the case of 
real data it is manually given by medical experts. The unit of 
RMSD will be in millimeter. 

A. Synthetic data 

We use synthetic data to demonstrate the ability of our 
algorithm to exactly recover the pose of the 3D model when 
there is no noise in the image. We also add bounded noise to 
each point in the image in the following way 

X\ i = + E\j), X2 j = ^2 {RXj + £2 j), 

it n u j G 1,.. .,n 2 , (39) 

where eu, £ 2 j ~ U[—£,£] 3 and U [—£,£] 3 is the uniform 
distribution in the cube [—£,e] 3 . 

In order to run conreg2, we simply get Pi 2 from the 
epipolar constraints. For a point in image 1, denoted as x G M 3 
in homogeneous pixel coordinate, the epipolar line l in image 
2 can be computed as l = Fx, where F £ M 3x3 denotes the 
fundamental matrix. If point x' in image 2 satisfies \l T x'\ = 
\x t Fx'\ < 5 where S is some pre-defined threshold, we then 
set the entry in Pi 2 corresponding to the points x and x' to be 
1. We pick S to be 2 to 3 times the average spacing between 
the neighboring points. 

We first start with the Stanford bunny dataset to illustrate 
the exact recovery of pose from conregl and conreg2. 
We first project the bunny in x and //-directions to obtain X\ 
and X2, respectively. We then pose the model bunny X with 
arbitrary rotation and register it to its projections. In Fig. [3^, 
we show the registration results of conregl in one viewing 
direction with m = 473. When there is no noise, the exact 
recovery of pose is verified in Fig. [3] While we do not give 
guarantees on the approximate recovery of pose when there 
is noise, we also show in Fig. [ 3 J 3 that our proposed methods 
are stable when perturbed by noise. We note that for Fig. 
S>, we sparsify the number of points to m = 60 and the 
average distance between neighboring points is 0.04. Although 
conregl can solve registration problem with m = 473 in 
less than 10 seconds, the computational cost of conreg2 is 
high and we have to limit the number of points. 

For the registration of coronary vessels, we perform simu¬ 
lations for two different cases (Fig. [T]), namely, when the 2D 
images are exactly the projections of the 3D model and when 



Noise (dB) 

(a) r 



conregl 


conreg2 


Noise (dB) 
(b) 


Fig. 4. Simulation results of the two convex programs (a) when the images 
are the projection of the complete 3D model and (b) when the images contain 
the projections of the 3D model as a proper subset. For each noise level, we 
do 50 experiments with different poses of the 3D model, with rotation ranging 
from 0° to 90°. The RMSD gradually increases as the noise increases from 
no noise (00 dB) to 2 mm (9.6 dB). 


there are extra centerlines in the 2D images. In the latter case, 
the 3D model is a proper subset. In our simulations, we first 
rotate the 3D model of coronary vessels into some arbitrary pos¬ 
ture and project it in two fixed orthogonal directions. We tested 
our algorithm systematically when rotation R in ( [39} is within 
each of the ranges [0°, 10°], [10°,20°], [20°,40°], [40°,60°], 
and [60°,90°], using 10 different noise realizations in each 
case. 

We observed exact recovery in both programs when the noise 
level is zero, which confirms our proof in Section [V] Since our 
proposed methods are convex programs, the solution, in this 
case the ground truth rotation, does not depend on initialization. 
Fig. [2] shows the results of both programs with no noise added 
to the points with respect to the initial pose of the 3D model. 
In our experiments, we set the tolerance level for the convex 
solver to 10 -3 . Fig. [ 4 ] shows the results for both programs with 
bounded noise added to each point in the image. In this case 
the RMSD increases proportionally as the noise e increases 
from 0 to 2 mm for both input conditions. We observe that the 
latter input condition of extra centerlines in the image may be 
a better approximation of a typical clinical scenario as often 
non vascular objects such as catheters are mis-segmented as 
vessels. 

We compare our algorithm with the recent 2D/3D registration 
algorithm OGMM G3> |42| . In our experiments, we use our 
MATLAB implementation of GD for two images with each 
individual gaussian distribution being isotropic. If we are to use 
OGMM, using the identity as an initialization, the recovery of 
the pose is not guaranteed unless we are in the [0°, 10°] range. 
In fact for most experiment we cannot find solution within 25 
mm RMSD. We show the results from OGMM in Fig. [2j where 
we plot the RMSD against different 3D poses. We note that 
there are many ways one can improve on our implementation 
of OGMM in terms of radius of convergence, such as annealing 
to zero the variance of Gaussian distributions starting from a 
large value as proposed in G3- Nevertheless, it still remains 
that registration methods based on Gaussian mixtures method 
GD can get stuck in a local optimum due to the nature of the 
cost. 
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Patient Patient 

(a) (b) 

Fig. 5. (a) Results using the two convex programs conregl (Red), conreg2 
(Green), and OGMM (Blue) for data of 6 patients, (b) Number of instances for 
which the RMSD error was less than 10 mm (solid color), 15 mm (north east 
lines) and 25 mm (north west lines), respectively. For each of the 6 patients, 
for each rotation range we generate 10 different instances resulting in a total 
of 50 instances per patient. 


B. Real data 

To illustrate the application of our algorithm on clinical 
data, we have used pre-processed images from Chronic Total 
Occlusion (CTO) cases treated using PCI. Both our methods 
(REG1 ) and (REG2 ) operate on point-sets and hence any im¬ 
ages must be pre-processed to segment relevant structures such 
as vessels or organ surfaces. In case of CTO cases, the relevant 
structures to align are the centerlines of the coronaries as seen 
in pre-operative CTA images and intra-operative fluoroscopic 
images. To extract the centerlines in CTA images, we have used 
the segmentation algorithm presented in (36). The resultant 
centerlines also provide topological information allowing us to 
compute geometric features such as tangents. These centerlines 
were re-sampled at uniform interval to result in the 3D point- 
set for registration methods. The fluoroscopic images were 
processed using 2D vessel segmentation algorithms (37) , (38) 
to provide respective centerlines in the projection images. These 
were re-sampled to provide the 2D point-sets. The projection 
matrix was estimated from the C-arm parameters stored in the 
DICOM header of the image. 

We test our algorithm on six sets of clinical data in which 
a medical expert has aligned the 3D model to 2D fluoroscopic 
images. We consider this the ground truth to compare against 
our results. To characterize the recovery of our algorithm, 
we perturb the aligned 3D model by an arbitrary rotation 
within the ranges [0°, 10°], [10°, 20°], [20°,40°], [40°, 60°], 
and [60°, 90°]. For each rotation range, we generate 10 different 
instances. The results are detailed in figure [5] and [6] Again 
for comparison we use results from OGMM. We see that we 
are able to recover the pose to within 10 mm for most cases. 
Further, the recovery of pose in consistent irrespective of the 
initial pose. Whereas, in general we cannot obtain RMSD 
within 25 mm with OGMM. On average, there are 80 points 
in each model and image. The average running time for 
conregl and conreg2 are 0.4s and 15s, respectively on an 
Intel® CORE™ i7 2.2GHz running MATLAB®. conreg2 
is significantly slower than conregl due to the additional 
positive semidefinite matrix that encodes cycle consistency. 
The effect of additional point consistency terms between the 
images in conreg2 is more prominent in real patient data. 
We note that clinical data typically may have more number of 
ambiguous 3D model to projection point matches that cannot 
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(a) 



[0,10] [10,20] [20,40] [40,60] [60,90] 
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Fig. 6. (a) Results using the two convex programs conregl (Red), conreg2 
(Green), and OGMM (Blue) for data of 6 patients with respect to angle of pose, 
(b) Number of instances for which the RMSD error was less than 10 mm (solid 
color), 15 mm (north east lines) and 25 mm (north west lines), respectively 
with respect to angle of pose. For each of the 6 patients, for each rotation 
range we generate 10 different instances resulting in a total of 50 instances 
per patient. 


be resolved adequately using conregl without enforcing 
explicit matching between images. In such cases, the point 
consistency terms in conreg2 may aid in resolution of such 
ambiguities. We present the images of the 3D model after the 
2D/3D registration in Fig. [7] 

In Fig. [7] we not only perform registration using conregl 
and conreg2 but also further refine their solutions using 
gmmreg. Typically for convex relaxation techniques such as 
the one adopted in this paper, it is not unusual to prove the 
convex relaxed solutions to the noisy data lie in the vicinity of 
the ground truth (27), |43j, (44) . We, in fact, demonstrate 
via simulations with synthetic data that the solutions of 
algorithms conregl and congreg2 lie within certain radius 
(proportional to the noise magnitude) of the ground truth (Fig. 
[4}. Thus, local optimizers such as I CP or GMM based methods 
after conregl and conreg2 could be used to further refine 
the results. As local search based registration methods often 
fail to reach the global optimum due to non-convex nature of 
the cost, this combined approach ensures the ground truth is 
within the basin of convergence of any local methods. It is 
clear that this strategy brings improvement in row 1,3 and 4 
in Fig. [7] especially when registering using conregl. While 
adding a local refinement step brings little changes visually 
in row 2 and 5 and even seems counterproductive for row 
6 in Fig. [7] the RMSD when comparing with the ground 
truth given by the clinician is in fact lowered. For example, 
it is shown in Fig. [5^ that conregl and conreg2 fail to 
achieve 15 mm RMSD for images of patient 6. However after 
a local refinement all test cases for 6 patients achieve 15 mm 
RMSD (Table. [I]). The deterioration of registration quality for 
patient 6 in image plane A stems from the fact that the local 
refinement tries to fit the model to the vessels in both of the 
image planes. It may bring improvement to registration in one 
of the image plane while sacrificing the registration quality for 
the other image plane in order to lower the overall cost. Results 
combining algorithms conregl and congreg2 with OGMM 
and a variant of LM-ICP ED as local search methods are 
presented in Table [I] We include two different local methods 
to illustrate that the final results are not dependent on the 
local refinement method utilized. Though the desired accuracy 
of registration is highly dependent on application with some 
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applications desiring sub-millimeter accuracy and others being 
satisfied with centimeters, in order to facilitate a comparison 
between different algorithms, we bin the RMSD error into 
ranges. As shown in Table [TJ most cases fail to converge 
to within 25 mm when using the local methods alone. Our 
proposed method attain results well within the 10 mm for 
almost all cases, with sub-millimeter differences between the 
use of OGMM and LM-ICP as refinement methods. Though 
conreg2 has some instances with larger RMSD error, overall 
its performance is better than conregl in terms of median 
RMSD error. Both methods outperform using OGMM or the 
LM-ICP variant alone for robustness to initial starting point 
and median RMSD. 


TABLE I 

The number of experiments with RMSD within a given range on 6 

REAL DATA SETS. EACH PATIENT DATA SET WAS SIMULATED WITH A TOTAL 
OF 50 DIFFERENT POSES, FOR 6 PATIENTS. THE NUMBER IN BRACKET IS 
THE MEDIAN OF THE RMSD FOR SOLUTIONS WITHIN RESPECTIVE 
THRESHOLDS. THE RANGE OF RMSD IS REPORTED FOR ATTEMPTS WITH 
RMSD GREATER THAN 25 MM. 


Methods 

<10 mm 

< 15 mm 

>25 mm 

ICP 

OGMM 

19 (5.4) 

85 (4.57) 

22 (5.71) 

101 (5.14) 

269 [26.3, 237] 

165 [25.2, 738] 

conregl 

conregl+OGMM 

conregl+ICP 

252 (5.71) 

300 (3.80) 

291 (4.40) 

270 (5.71) 

300 (3.80) 

299 (4.73) 


conreg2 

conreg2+OGMM 

conreg2+ICP 

262 (3.58) 

291 (2.31) 

293 (2.56) 

262 (3.58) 

292 (2.31) 

300 (2.68) 



VIII. Discussion and Conclusion 

A new formulation for 2D/3D registration based on convex 
optimization programs has been proposed, and applied to the 
problem of registering a 3D centerline model of the coronary 
arteries with a pair of fluoroscopic images. The proposed opti¬ 
mization programs jointly optimize the correspondence between 
points and their projections, and the relative transformation. 
When the global optimum of the convex programs coincide 
with the solution to the 2D/3D registration problem, efficient 
search of the solution regardless of initialization can be done 
via standard off-the-shelf convex programming software. 

In the first program presented, we find the correspondence 
and transformation simultaneously between two degenerate 
point clouds obtained by projection of a 3D model and the 
model itself. In the second program presented, we solve a 
variant of the first program where a-priori information on the 
estimated correspondence between degenerate point clouds 
is exploited. Such estimate of correspondence could come 
from feature matching or known epipolar constraints. We 
show that under certain conditions a convexly relaxed versions 
of these programs converge to recover the solution of the 
original MINLP point-set registration program. This is in 
contrast to I CP -type algorithms, where the correspondences 
and transformation are optimized alternatively. A natural 


extension of both programs is to consider additional descriptors 
and features attached to a particular point. We presented 
a framework to incorporate either transformation invariant 
features or features that transform according to rotation such 
as gradients. We illustrate this formulation with the use of 
tangent of the vessel derived from a neighborhood patch. 

The proof of exact recovery for the noiseless case is presented 
and the theoretical exact recovery results corroborated using 
synthetic data. We also evaluated the performance of the 
proposed formulations using noisy observations obtained by 
adding uniformly distributed bounded noise to the point sets. 
These exhibit convergence to within a radius of the exact 
solution that is proportional to the added noise level. Both 
cases where the projected image fully captures the 3D model 
and cases when there are extra centerlines in the image were 
simulated. These mimic the clinical scenario where there may 
be surgical tools and devices such as catheters that have a 
similar appearance as that of a contrast filled vessel captured 
in the image. We find that under such conditions using local 
optimization programs such as OGMM alone, and using identity 
rotation as initialization, the recovery of the pose is not 
guaranteed unless we are in 0 — 10 degree range from the 
optimal solution. In fact, for most experiments we cannot find 
a solution that converges to within 25 mm RMSD error. We note 
that there are ways one can improve on the implementations 
of local optimizers in terms of range of convergence by using 
methods such as annealing. Nevertheless, it still remains that 
registration methods based on local search may fail to reach 
the global optimum due to non-convex nature of the cost. We 
hope to bridge the gap in such cases, in addition to providing 
an explicit correspondence match. 

Finally, experiments using multiple X-ray biplane angiog¬ 
raphy frames have also been presented. The validation was 
performed by perturbing the original pose by a random pose 
in a wide range of values. The pose and the correspondence 
can be recovered to within 10 mm RMSD error in most cases. 
Our algorithm can be used as a pre-processing step to provide 
a high quality starting point for a local registration algorithm 
such as OGMM or alone to provide recovery of transformation 
and correspondence. 

There are several ways our proposed method can be 
improved. As shown in our proof, a condition we need for 
exact recovery is that the point sets are generic, implying the 
centroids of X and Xi s are not zero. This limits our ability 
of including a translation by centering. A way to consider the 
translational degree of freedom is to include additional features. 
For example, we can use the tangent information in registration 
by introducing a cost in the form of || TgitTx — Tx.Pi|| 2 ,i, 
where Tx and Tx i in M 3xm are the tangent vectors computed 
for points in the model and the image. In this case, with 
high probability the tangent vectors are not “centered” and 
we can still apply our proposed methods. The introduction 
of an additional cost term that encourages matching of the 
invariant features as in Section VI-A can be helpful in similar 
way. We also note that during revision, a work in progress on 
solving the joint pose and correspondence problem using a 
tighter relaxation |46| is brought to our attention. According 
to the communication with the authors there is no restriction 
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OGMM only 


Before local refinement 


After local refinement 


OGMM only 



Before local refinement 


After local refinement 





Image Plane A 


Image Plane B 


Fig. 7. Each row is an illustrative sample result out of the 50 different initial poses used for registration for each single patient. The black lines are the 
segmented vessel centerline in the X-ray images. Column 1 and 4: Red lines denote the 3D model posed by OGMM. Column 2 and 5: Red lines denote the 3D 
model posed by 77* from conregl. Blue lines denote the 3D model posed by 77* from conreg2. Column 3 and 6: Red lines denote the 3D model posed 
by 77* from conregl + OGMM. Blue lines denote the 3D model posed by 77* from conreg2 + OGMM. For brevity the initial pose is omitted as it moves the 
model out of the field of view. The results with LM-ICP as the local refinement method are omitted as they are visually not distinct from the results using 
OGMM method. As noted in Table [1] the difference between the refinement from the two local methods is sub-millimeter. In this figure, the cases reported when 
using OGMM alone has error ranging from 12 mm to 40 mm. Except patient 6, the registration error of using the convex programs alone or with refinement are 
all below 10 mm. We refer the readers to Table [I] for a quantitative comparison of different methods. 


in considering the translational degree of freedom in their 
relaxation. Another similar problem is the need of equal number 
of points in both the model and image in the proof. While it is 
possible for our methods to work when the number of points 
are different, the performance is less stable and it is important 
to identify extra features. 

Since the focus of this paper is to introduce a novel 
formulation to the 2D/3D registration problem, we did not 
put an emphasis on designing a fast convex program solver. 
Currently we are using the CVX package, a library of interior 
point solvers for convex programming. Interior point method is 
inherently a second order method that requires high complexity 
computations involving the Hessian. It certainly does not scale 
as well as alternating minimization approaches such as I CP 
or first order methods such as GMM. Therefore it will be 
desirable to design fast first order optimization algorithms such 


as Augmented Lagrangian Method of Multiplier (ADMM) |47| 
or conditional gradient descent |48| that have low complexity 
to solve conregl and conreg2. 

Currently we only have exact recovery guarantees when 
there is no noise in the image. As mentioned before, when data 
is plagued by noise the solution of conregl and conreg2 
will no longer be the solution to (REG1) and (REG2). However, 
as observed in our numerical experiments, there is stability in 
the sense that ii* deviates gradually from the ground truth R. 
A theoretical establishment of such stability will establish the 
convex programs as approximation algorithms for solving the 
registration problem. 

Lastly, our proposal can only deal with rigid registration at 
this point. Since the consideration of non-rigid transformation 
will introduce further nonlinearity into the problem, a more 
elaborate convex relaxation has to be devised. Although we 
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have not experimented with such scenarios, we recommend 
passing the solution from conregl or conreg2 to a non- 
rigid registration method based on local searches for refinement 
purpose. 


that P k (i,j ) > 0. Equivalently ; if we regard P as the adjacency 
matrix of a directed graph, it means the graph corresponding 
P is strongly connected. 

We then state the fundamental theorem of Markov chain. 


Appendix A 


Proof The equations (22) and 
fact that 


simply follows from the 


o < W^iRX' - W'Alki + ||*2 RX' - M'Alki 
< ||*l/3*' - VlX'lmhl + ||* 2 /3^' - *2X'l m h fl 

= 0. (40) 

The second inequality follows from the optimality of Pi , P 2 
and R. 

Now 


Theorem 4 (Fundamental Theorem of Markov Chain (49)). If a 
Markov chain with transition matrix P G M mxm is irreducible 
and aperiodic, then 

lim P n (i, j) = 7Tj Mi = 1,..., m, (46) 

where the limiting distribution 1 r = [tti, ..., 7r m \ is the unique 
solution to the equation 


7 r = it P. 


(47) 


RX' = X'P 1 Pp 1 

RX' = X'P 2 +r ) 2 (41) 

where 771 G M 3xm , 772 C M 3xm satisfies *1771 = * 2^2 = 0. 
By induction, we have 

R n X’ = X'P? +rn(P ?- 1 + ... + P 1 +1) 

R n X’ = X , P^ + V 2 (P 2~ 1 + --- + P 2 + 1). (42) 

Multiplying these two equations by *1 and *2 respectively 
we get ( [24] ) and ( [25] ). □ 


Appendix B 

Proof Our goal is to show the solution P\.P 2 . R being 
7 m5 7 m ,/ 3 respectively. It suffices to prove uniqueness of 
7 m ,7 m ,/3 as solution to conregl in a local neighborhood, 
since for a convex program local uniqueness implies global 
uniqueness. Therefore we will assume || P% ~ I'm |It < 1 and 
|| P 2 — Im ||i < 1- We now show Pi,P 2 are doubly stochastic 
matrices. The constraints l T Pi = 1 T P2 = 1 T in conregl 
indicates 

mm mm 

DE^')) = = m. (43) 

i=l 3=1 i =1 3=1 

Since for every i = 1 p i(hj), YPjLi p 2 {i,j) < 

1, it has to be that 

m m 

En(*>■?)’ E = 1 for ever y* = 1,... ,m, (44) 

i=i j=i 

or else it violates ( [43] ). 

Based on these facts and regarding Pi,P 2 as transition 
matrices of a Markov chain, we will use the fundamental 
theorem of Markov chain to prove the lemma. Before stating the 
theorem, we need to give two standard definitions of stochastic 
processes. 

Definition 1. A Markov chain with transition matrix P G 
M mxm i s aperiodic if the period of every state is 1 , where the 
period of a state i, i — 1 ,..., m is defined as 

gcd {k : P k {i,i) > 0} (45) 

Definition 2. A Markov chain with transition matrix P G 
M mxm is irreducible if for all i,j, there exists some k such 


We are now ready to prove the lemma. From the assumption 
\\Pi - I m \\ 1 < 1 and || P 2 - I m ||i < 1, we have 

diag(Pi) > 0, diag(P 2 ) > 0. (48) 

This means the two doubly stochastic matrices Pi, P 2 are 
aperiodic. Furthermore we can decompose the matrices P\,P 2 
into irreducible components. Such decomposition essentially 
amounts to decomposing a graph into connected components, 
if we regard a doubly stochastic matrix as weighted adjacency 
matrix of a graph with m nodes. Now denote a±, a 2 , ..., a& 
and b \, b 2 ,..., bi as index sets of the irreducible components 
of matrix Pi and P 2 respectively. For such decomposition, we 
know Pi(a*,aj) = 0 for i j (similarly for P 2 ). Therefore 
for each aperiodic and irreducible block Pi(ai,af) of Pi and 
-^ 2 (pi: hf) of P 2 , we have 

{l/\ a i\)l T Pi(ai,a,i) = (l/|ai|)l T , 

(l/\bi\)l T P 2 (bi,bi) = (1/N)1 T (49) 


Applying fundamental theorem of Markov chain to each 
irreducible and aperiodic block along with equation (49), we 
have 


lim Pi n (a»,ai) = (l/|aj|)ll T , 

n—t oo 

lim P^(bi,bi) = (1/N)11 T (50) 

n—>• oo 

and zero for all other indices. We note that the multiplication 
X ai (l/\ai\)ll T results \af copies of the centroid of X ai . 
Hence the limit of each irreducible component of Pi, P 2 is an 
averaging operator. 

We now take limit of equations ( [24] ) and ( [25] ). Multiplying 
both ( [24| and ( [25] ) from the right with [0 0 1], and using the 
fact that [0 0 l]4>i = [0 0 l]4 r 2 = [0 0 1 ] we get 


[0 0 l\X lim Pf = [0 0 1 \X lim P” 


(51) 


Now assuming the point set X contains generic coordinates, 
we know that no two set of points from X have centers with 
same z coordinates. Combining this fact with (50) and ( [57] ), 
we get {ai, a 2 ,. - •, a k } = {bi, b 2 ,..., b t } and limn^oo Pf = 


lim P 71 


= A 


□ 
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